Penalized cluster analysis with applications to family data
Abstract
Cluster analysis assigns observations to clusters so that observations in the same cluster are similar in some sense, and many clustering methods have been developed. However, these methods cannot be applied to family data, which possess intrinsic familial structure. To take the familial structure into account, we propose a form of penalized cluster analysis with a tuning parameter that controls the influence of that structure. The tuning parameter can be selected based on the concept of clustering stability. The method can also be applied to other clustered data, such as panel data. The method is illustrated via simulations and an application to a family study of asthma.
Similar resources
Penalized Estimators in Cox Regression Model
The proportional hazards Cox regression model plays a key role in analyzing censored survival data. We use penalized methods in high-dimensional scenarios to achieve more efficient models. This article reviews penalized Cox regression for some frequently used penalty functions. Analysis of the medical data set "mgus2" confirms that the penalized Cox regression performs better than the cox regressi...
Non-hierarchical Clustering with Rival Penalized Competitive Learning for Information Retrieval
In large content-based image database applications, efficient information retrieval depends heavily on good indexing structures of the extracted features. While indexing techniques for text retrieval are well understood, efficient and robust indexing methodology for image retrieval is still in its infancy. In this paper, we present a non-hierarchical clustering scheme for index generation using the...
Network-based clustering with mixtures of L1-penalized Gaussian graphical models: an empirical investigation
In many applications, multivariate samples may harbor previously unrecognized heterogeneity at the level of conditional independence or network structure. For example, in cancer biology, disease subtypes may differ with respect to subtype-specific interplay between molecular components. Then, both subtype discovery and estimation of subtype-specific networks present important and related challe...
Rival Penalization Controlled Competitive Learning for Data Clustering with Unknown Cluster Number
Conventional clustering algorithms such as k-means (Forgy 1965, MacQueen 1967) need to know the exact cluster number k∗ before performing data clustering. Otherwise, they will lead to a poor clustering performance. Unfortunately, it is often hard to determine k∗ in advance in many practical problems. Under the circumstances, Xu et al. in 1993 proposed an approach named Rival Penalized Competiti...
Comparison of Ordinal Response Modeling Methods like Decision Trees, Ordinal Forest and L1 Penalized Continuation Ratio Regression in High Dimensional Data
Background: Response variables in most medical and health-related research have an ordinal nature. Conventional modeling methods assume predictor variables to be independent, and consider a large number of samples (n) compared to the number of covariates (p). Therefore, it is not possible to use conventional models for high dimensional genetic data in which p > n. The present study compared th...
Journal title:
- Computational Statistics & Data Analysis
Volume 55, Issue
Pages -
Publication date 2011